Methods and systems for single cell gene profiling

ABSTRACT

This disclosure provides methods and systems for single-cell analysis, including single-cell transcriptome analysis, of target cells without microfluidic devices. The disclosed methods involve the use of template particles to template the formation of monodisperse droplets to generally capture a single target cell from a population of cells in an encapsulation, derive a plurality of distinct mRNA molecules from the single target cell, and quantify the distinct mRNA molecules to generate an expression profile.

TECHNICAL FIELD

This disclosure relates to methods and systems for single cell gene profiling.

BACKGROUND

The complexity of biological systems necessitates many experiments to characterize them. High-throughput methods are often implemented to reduce the number of individual experiments that need to be performed. Unfortunately, methods for high-throughput analysis of single cells are constrained by costs associated with isolating single cells and preparing libraries.

Methods for isolating single cells generally require microfluidic devices that are complicated to use and expensive to operate. Moreover, since cells are processed individually, microfluidic devices are inherently limited in terms of the number of cells that can be assayed in a given experiment. And as such, high-throughput single cell systems are unavailable in many clinical and research facilities.

SUMMARY

This disclosure provides methods and systems for single-cell analysis, including single-cell transcriptome analysis, of target cells without microfluidic devices. Methods and systems of the invention generate an emulsion with template particles to segregate individual target cells into monodisperse droplets. Nucleic acid molecules are released from the target cells inside the monodisperse droplets and are quantified to generate expression profiles for each of the target cells. This approach provides a massively parallel analytical workflow that is inexpensive and scalable to ascertain expression profiles of millions of single cells with a single library preparation.

Methods and systems of the invention use template particles to template the formation of monodisperse droplets and isolate target cells for gene profiling. Methods include combining template particles with target cells in a first fluid, adding a second fluid to the first fluid, and shearing the fluids to generate a plurality of monodisperse droplets simultaneously, wherein each of the monodisperse droplets contains a single one of the template particles and a single one of the target cells. Methods further include lysing the target cells within the monodisperse droplets to release a plurality of distinct mRNA molecules and quantifying the plurality of distinct mRNA molecules. Data generated by quantifying the mRNA is used to create an expression profile for each of the target cells. Methods further include processing the expression profiles to identify characteristics of the target cells that can be used to, for example, make a diagnosis, make a prognosis, or determine drug effectiveness.

Methods and systems of the invention provide a method for quantifying gene expression of target cells. The method includes releasing mRNA from target cells inside monodisperse droplets. The mRNA may be reverse transcribed into cDNA and simultaneously barcoded. The barcoded cDNA is amplified to generate a plurality of barcoded amplicons. The amplicons can be sequenced by next generation sequencing methods, and because of the barcodes, each sequence read can be traced back to the target cell. The sequence reads are processed by certain computer algorithms to generate an expression profile for the target cell.

After obtaining expression profiles from target cells, the profiles may be analyzed by comparing the profiles with reference or control profiles to ascertain information about the target cells. In other instances, profiles of target cells can be compared to profiles derived from cells with certain phenotypes to determine whether the target cells share characteristics of the cells of the phenotype.

In one aspect, methods and systems of the invention provide a method for identifying the presence of a rare cell in a heterogeneous cell population. The method includes isolating a plurality of target cells by combining target cells with a plurality of template particles in a first fluid, adding a second fluid that is immiscible with the first fluid, and shearing the fluids to generate an emulsion comprising monodisperse droplets that contain a target cell and a single template particle. Methods further include releasing a plurality of mRNA molecules inside the droplet containing the target cell and quantifying the plurality of mRNA molecules. Quantifying may include reverse transcribing the mRNA into cDNA that is barcoded. The barcoded cDNA may be amplified to generate a plurality of barcoded amplicons that can be traced back to the target cell. In some instances, methods may include sequencing the plurality of barcoded amplicons by, for example, next-generation sequencing methods to generate sequence reads. Methods may further include processing the sequence reads to generate expression profiles for each target cell and using the data by, for example, performing a gene clustering analysis to identify one or more cell types or cell states among the target cells.

In another aspect, methods and systems of the disclosure provide a method for analyzing a heterogeneous tumor biopsy taken from a subject. The method includes obtaining a biopsy from a patient and isolating a population of cells from the biopsy. The method further includes segregating the population of cells into droplets by creating a mixture of the population of cells and a plurality of template particles in an aqueous fluid, adding an oil, and vortexing the mixture to generate an emulsion comprising droplets that each contain a single one of the population of cells and a template particle. Methods further include releasing mRNA from each one of the cells inside the droplets and performing transcriptome analysis on one or more genes. The analysis of one or more genes may be used to identify one or more characteristics of a cancer. A characteristic of cancer can be the presence, or absence, of one or more gene transcripts associated with cancer. A method disclosed herein can further comprise the step of using the characteristic to diagnose the subject with cancer and devise a treatment plan.

In some aspects, methods and systems of the invention provide a method for determining the potential effectiveness of a therapeutic agent. The method comprises segregating a first population of diseased cells into droplets with template particles and determining gene expression from at least one of the diseased cells, thereby producing a disease-state expression signature. The method further includes exposing a second population of disease state cells to an agent, determining gene expression of the second population of cells, and comparing the gene expression with the disease-state expression signature to ascertain the effectiveness of the agent against the disease based on an elevated or repressed level of expression of one or more genes. In some embodiments, the therapeutic agent may be delivered to the second population of cells inside the droplets. For example, the agent may be associated with the template particle by tethering the agent to an external surface of the template particle, or by packaging the agent inside a compartment of the template particle and releasing the agent from the template particle inside the droplets.
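The comparison of a treated population against the disease-state expression signature can be illustrated with a minimal sketch. The disclosure does not specify a statistic; the log2 fold change, the pseudocount, the threshold, and the function names below are assumptions chosen only for illustration.

```python
import math


def log2_fold_changes(treated, signature, pseudocount=1.0):
    """Per-gene log2 ratio of treated expression to the disease-state signature.

    Both arguments map gene name -> expression level (e.g., transcript counts).
    The pseudocount (an assumption) avoids division by zero for absent genes.
    """
    genes = set(treated) | set(signature)
    return {
        gene: math.log2(
            (treated.get(gene, 0.0) + pseudocount)
            / (signature.get(gene, 0.0) + pseudocount)
        )
        for gene in genes
    }


def classify_genes(fold_changes, threshold=1.0):
    """Label each gene as elevated, repressed, or unchanged.

    The threshold of 1.0 (a two-fold change) is a hypothetical cutoff.
    """
    labels = {}
    for gene, value in fold_changes.items():
        if value >= threshold:
            labels[gene] = "elevated"
        elif value <= -threshold:
            labels[gene] = "repressed"
        else:
            labels[gene] = "unchanged"
    return labels


# Hypothetical expression levels for three genes.
signature = {"GENE_A": 1, "GENE_B": 7, "GENE_C": 3}
treated = {"GENE_A": 7, "GENE_B": 1, "GENE_C": 3}
changes = log2_fold_changes(treated, signature)
labels = classify_genes(changes)
```

In practice the signature and the treated profile would each be aggregated over many single cells, and a statistical test would replace the fixed cutoff.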

In certain aspects, the methods and systems of the invention provide a method for segregating cells into droplets. The droplets may be prepared as emulsions, e.g., as an aqueous phase fluid dispersed in an immiscible phase carrier fluid (e.g., a fluorocarbon oil, silicone oil, or a hydrocarbon oil) or vice versa. Generally, the droplets are formed by shearing two liquid phases. Shearing may comprise any one of vortexing, shaking, flicking, stirring, pipetting, or any other similar method for mixing solutions. Methods of the invention include combining cells with template particles in a first fluid, adding a second fluid, and shearing or agitating the first and second fluid. Preferably, the first fluid is an aqueous phase fluid, and, in some embodiments, may comprise reagents selected from, for example, buffers, salts, lytic enzymes (e.g., proteinase K) and/or other lytic reagents (e.g., Triton X-100, Tween-20, IGEPAL, or combinations thereof), nucleic acid synthesis reagents, e.g., nucleic acid amplification reagents or reverse transcription mix, or combinations thereof.

Methods and systems of the invention use template particles to template the formation of monodisperse droplets and isolate target cells. Template particles according to aspects of the invention may comprise hydrogel, for example, selected from agarose, alginate, a polyethylene glycol (PEG), a polyacrylamide (PAA), acrylate, acrylamide/bisacrylamide copolymer matrix, azide-modified PEG, poly-lysine, polyethyleneimine, and combinations thereof. In certain instances, template particles may be shaped to provide an enhanced affinity for target cells. For example, the template particles may be generally spherical but the shape may contain features such as flat surfaces, craters, grooves, protrusions, and other irregularities in the spherical shape that promote an association with the target cell such that the shape of the template particle increases the probability of templating a droplet that contains the target cell.

In some aspects, methods and systems of the invention provide template particles that include one or more internal compartments. The internal compartments may contain a reagent or compound that is releasable upon an external stimulus. Reagents contained by the template particle may include, for example, cell lysis reagents or nucleic acid synthesis reagents (e.g., a polymerase). The external stimulus may be heat, osmotic pressure, or an enzyme. For example, in some instances, methods of the invention include releasing a reverse transcriptase directly inside of a droplet containing mRNA.

In some aspects, methods and systems of the invention provide a library prep method for analyzing a transcriptome of a single cell. Methods include releasing mRNA from a single target cell contained inside a droplet. In some embodiments, the released mRNA attaches to a poly T sequence of a barcoded capture probe attached to a template particle via complementary base pairing. Alternatively, the released RNA attaches to a gene-specific sequence of the barcoded capture probe. Following attachment of the mRNA molecule with the capture probe, a reverse transcriptase synthesizes cDNA and thereby creates a first strand comprising cDNA and the capture probe sequence. The mRNA molecule-first strand hybrid is then denatured using any method known in the art, such as exposure to a denaturing temperature. In a next step, a second strand primer comprising a random hexamer sequence anneals with the first strand to form a DNA-primer hybrid. A DNA polymerase synthesizes a complementary second strand. In some instances, the second strand is amplified by, for example, PCR, to generate a plurality of amplicons which are analyzed to ascertain an expression profile of the single cell.

In certain aspects, this disclosure provides a kit for single cell profiling according to methods of the invention. The kit includes template particles comprising a plurality of capture sequences specific to one or more genes of interest. A researcher following instructions provided by the kit can use template particles to assay single cell expression of specific genes of interest, such as oncogenes. The kit can allow for single cell profiling according to methods described throughout this disclosure (e.g., at FIG. 1). Template particles may be custom designed for the user's specific needs, for example, designed to include capture probe sequences specific to certain genes of interest, such as oncogenes. The template particles may be shipped inside sample preparation tubes, or sample collection tubes, such as blood collection tubes. The template particles are preferably in a dried format. The kit may further include reagents, such as cell lysis reagents and nucleic acid synthesis reagents.

In other aspects, methods and systems of the invention provide a method of collecting data regarding a transcriptome of a single cell. The method comprises the steps of releasing a plurality of distinct mRNA molecules from single cells inside monodisperse droplets, collecting data regarding a transcriptome of the single cells, and sending the data to a computer. A computer can be connected to a sequencing apparatus. Data corresponding to the transcriptome can further be stored after sending; for example, the data can be stored on a computer-readable medium which can be extracted from the computer. Data can be transmitted from the computer to a remote location, for example, via the internet.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 diagrams a method for single cell profiling.

FIG. 2 illustrates a droplet according to one aspect of the invention.

FIG. 3 illustrates a droplet following lysis of a target cell.

FIG. 4 illustrates the capture of mRNA.

FIG. 5 illustrates synthesis of cDNA to form a first strand.

FIG. 6 illustrates amplification of a first strand to generate an amplicon.

FIG. 7 illustrates a method for sequence-specific capture of mRNA.

FIG. 8 illustrates synthesis of cDNA to form a first strand.

FIG. 9 illustrates amplification of a first strand to generate an amplicon.

FIG. 10 illustrates the capture of mRNA according to TSO embodiments.

FIG. 11 shows a first strand following TS-PCR amplification.

DETAILED DESCRIPTION

This disclosure provides systems and methods of using template particles to form monodisperse droplets for segregating single cells and preparing a library therefrom to profile expression of the single cells. The disclosed methods involve the use of template particles to template the formation of monodisperse droplets to generally capture a single target cell in an encapsulation, derive a plurality of distinct RNA from the single target cell, prepare a library of nucleic acids that can be traced to the cell from which they were derived, and quantitate distinct RNA to generate an expression profile of the single target cell. Methods of the invention can be used to prepare libraries for single cell analysis of, for example, at least 100 cells, at least 1000 cells, at least 1,000,000 cells, at least 2,000,000 cells, or more, from a single reaction tube.

FIG. 1 diagrams a method 101 for single cell profiling. The method 101 includes combining 109 template particles with target cells in a first fluid, and adding a second fluid that is immiscible with the first fluid to the mixture. The first fluid is preferably an aqueous fluid. While any suitable order may be used, in some instances, a tube may be provided comprising the template particles. The tube can be any type of tube, such as a sample preparation tube sold under the trade name Eppendorf, or a blood collection tube, sold under the trade name Vacutainer. Template particles may be in dried format. Combining 109 may include using a pipette to pipette a sample comprising cells and, for example, the aqueous fluid into the tube containing template particles and then adding a second fluid that is immiscible, such as oil.

The method 101 then includes shearing 115 the fluids to generate monodisperse droplets, i.e., droplets. Preferably, shearing comprises vortexing the tube containing the fluids by pushing the tube onto a vortexer. After vortexing 115, a plurality (e.g., thousands, tens of thousands, hundreds of thousands, one million, two million, ten million, or more) of aqueous partitions is formed essentially simultaneously. Vortexing causes the fluids to partition into a plurality of monodisperse droplets. A substantial portion of droplets will contain a single template particle and a single target cell. Droplets containing more than one or none of a template particle or target cell can be removed, destroyed, or otherwise ignored.

The next step of the method 101 is to lyse 123 the target cells. Cell lysis 123 may be induced by a stimulus, such as, for example, lytic reagents, detergents, or enzymes. Reagents to induce cell lysis may be provided by the template particles via internal compartments. In some embodiments, lysing 123 involves heating the monodisperse droplets to a temperature sufficient to release lytic reagents contained inside the template particles into the monodisperse droplets. This accomplishes cell lysis 123 of the target cells, thereby releasing mRNA inside of the droplets that contained the target cells.

After lysing 123 target cells inside the droplets, mRNA is released and subsequently quantified 131. Quantifying 131 the mRNA generally requires synthesizing cDNA to generate a library comprising cDNA with a barcode sequence to allow each library sequence to be traced back to the single cell from which the mRNA was derived. In preferred embodiments, template particles isolated with the mRNA include a plurality of barcoded capture sequences that hybridize with target mRNA. After hybridization, cDNA is synthesized by reverse transcription. Reagents for reverse transcription can be provided in a variety of ways in a variety of formats. In some instances, reagents and reverse transcriptase are provided by the template particles. Once a library is generated comprising barcoded cDNA, the cDNA can be amplified by, for example, PCR, to generate amplicons for sequencing. Sequence reads are processed according to methods described herein to accomplish the quantification 131 of mRNA.
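The way barcodes let sequence reads be traced back to a single cell and tallied into an expression profile can be sketched as follows. This is a minimal illustration, not the processing method of the disclosure: it assumes each read has already been reduced to a (cell barcode, UMI, gene) tuple by upstream alignment, and the collapsing of duplicate UMIs and the function name are assumptions.

```python
from collections import defaultdict


def expression_profiles(reads):
    """Build per-cell gene counts from barcoded sequence reads.

    `reads` is an iterable of (cell_barcode, umi, gene) tuples, one per read.
    Reads that share barcode, UMI, and gene are treated as PCR duplicates of
    one original mRNA molecule and counted once (an assumption).
    """
    seen = set()
    profiles = defaultdict(lambda: defaultdict(int))
    for barcode, umi, gene in reads:
        key = (barcode, umi, gene)
        if key in seen:
            continue  # duplicate amplicon of an already counted molecule
        seen.add(key)
        profiles[barcode][gene] += 1
    # Convert nested defaultdicts to plain dicts for easier downstream use.
    return {barcode: dict(genes) for barcode, genes in profiles.items()}


# Hypothetical reads from two droplets (two cell barcodes).
reads = [
    ("AAAC", "U1", "GAPDH"),
    ("AAAC", "U1", "GAPDH"),  # PCR duplicate
    ("AAAC", "U2", "GAPDH"),
    ("AAAC", "U3", "MYC"),
    ("GGTA", "U1", "GAPDH"),
]
profiles = expression_profiles(reads)
```

Each key of the result corresponds to one barcode, and therefore to one template particle and the cell captured with it.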

In some aspects, the target cells may include live cells obtained from, for example, a sample (tissue or bodily fluid) of a patient. The sample may include a fine needle aspirate, a biopsy, or a bodily fluid from the patient. Upon being isolated from the sample, the cells may be processed by, for example, generating a single cell suspension with an appropriate solution. Such solution will generally be a balanced salt solution, e.g., normal saline, PBS, Hank's balanced salt solution, etc., and in certain instances supplemented with fetal calf serum or other naturally occurring factors, in conjunction with an acceptable buffer at low concentration, generally from 5-25 mM. Convenient buffers include HEPES (4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid), phosphate buffers, lactate buffers, etc. The separated cells can be collected in any appropriate medium that maintains the viability of the cells, usually having a cushion of serum at the bottom of the collection tube. Various media are commercially available and may be used according to the nature of the cells, including Dulbecco's modified eagle medium (dMEM), Hank's balanced salt solution (HBSS), phosphate buffered saline (PBS), Dulbecco's phosphate buffered saline (dPBS), Roswell Park Memorial Institute medium (RPMI), Iscove's medium, etc., frequently supplemented with fetal calf serum.

Methods and systems of the invention use template particles to template the formation of monodisperse droplets and isolate single target cells. The disclosed template particles and methods for targeted library preparation thereof leverage the particle-templated emulsification technology previously described in Hatori et al., Anal. Chem., 2018 (90):9813-9820, which is incorporated by reference. Essentially, micron-scale beads (such as hydrogels) or “template particles” are used to define an isolated fluid volume surrounded by an immiscible partitioning fluid and stabilized by temperature insensitive surfactants.

The template particles of the present disclosure may be prepared using any method known in the art. Generally, the template particles are prepared by combining hydrogel material, e.g., agarose, alginate, a polyethylene glycol (PEG), a polyacrylamide (PAA), acrylate, acrylamide/bisacrylamide copolymer matrix, and combinations thereof. Following the formation of the template particles they are sized to the desired diameter. In some embodiments, sizing of the template particles is done by microfluidic co-flow into an immiscible oil phase.

In some embodiments, the template particles vary in diameter or largest dimension such that at least 50% or more, e.g., 60% or more, 70% or more, 80% or more, 90% or more, 95% or more, or 99% or more of the template particles vary in diameter or largest dimension by less than a factor of 10, e.g., less than a factor of 5, less than a factor of 4, less than a factor of 3, less than a factor of 2, less than a factor of 1.5, less than a factor of 1.4, less than a factor of 1.3, less than a factor of 1.2, less than a factor of 1.1, less than a factor of 1.05, or less than a factor of 1.01.
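One way to check a batch of template particles against such a size criterion is sketched below. The reading adopted here is an assumption: the code finds the largest group of particles whose largest and smallest diameters differ by less than the given factor and reports that group as a fraction of the batch. The function name and the sample diameters are hypothetical.

```python
def fraction_within_factor(diameters, factor):
    """Largest fraction of particles whose diameters differ by less than `factor`.

    Sorts the diameters and slides a window so that, within the window,
    max diameter < factor * min diameter.
    """
    if factor <= 1:
        raise ValueError("factor must be greater than 1")
    if not diameters or min(diameters) <= 0:
        raise ValueError("diameters must be a non-empty list of positive values")
    ordered = sorted(diameters)
    best = 0
    low = 0
    for high in range(len(ordered)):
        # Shrink the window from the left until it satisfies the factor bound.
        while ordered[high] >= factor * ordered[low]:
            low += 1
        best = max(best, high - low + 1)
    return best / len(ordered)


# Hypothetical diameters in micrometers; one particle is an outlier.
batch = [50.0, 52.0, 55.0, 100.0]
fraction = fraction_within_factor(batch, factor=1.2)
meets_criterion = fraction >= 0.5
```

For this batch, three of the four particles fall within a factor of 1.2 of one another, so the 50% criterion is met.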

Template particles may be porous or nonporous. In any suitable embodiment herein, template particles may include microcompartments (also referred to herein as “internal compartments”), which may contain additional components and/or reagents, e.g., additional components and/or reagents that may be releasable into monodisperse droplets as described herein. Template particles may include a polymer, e.g., a hydrogel. Template particles generally range from about 0.1 to about 1000 μm in diameter or largest dimension. In some embodiments, template particles have a diameter or largest dimension of about 1.0 μm to 1000 μm, inclusive, such as 1.0 μm to 750 μm, 1.0 μm to 500 μm, 1.0 μm to 250 μm, 1.0 μm to 200 μm, 1.0 μm to 150 μm, 1.0 μm to 100 μm, 1.0 μm to 10 μm, or 1.0 μm to 5 μm, inclusive. In some embodiments, template particles have a diameter or largest dimension of about 10 μm to about 200 μm, e.g., about 10 μm to about 150 μm, about 10 μm to about 125 μm, or about 10 μm to about 100 μm.

In practicing the methods as described herein, the composition and nature of the template particles may vary. For instance, in certain aspects, the template particles may be microgel particles that are micron-scale spheres of gel matrix. In some embodiments, the microgels are composed of a hydrophilic polymer that is soluble in water, including alginate or agarose. In other embodiments, the microgels are composed of a lipophilic microgel.

In other aspects, the template particles may be a hydrogel. In certain embodiments, the hydrogel is selected from naturally derived materials, synthetically derived materials, and combinations thereof. Examples of hydrogels include, but are not limited to, collagen, hyaluronan, chitosan, fibrin, gelatin, alginate, agarose, chondroitin sulfate, polyacrylamide, polyethylene glycol (PEG), polyvinyl alcohol (PVA), acrylamide/bisacrylamide copolymer matrix, polyacrylamide/poly(acrylic acid) (PAA), hydroxyethyl methacrylate (HEMA), poly N-isopropylacrylamide (PNIPAM), polyanhydrides, and poly(propylene fumarate) (PPF).

In some embodiments, the presently disclosed template particles further comprise materials which provide the template particles with a positive surface charge, or an increased positive surface charge. Such materials may be, without limitation, poly-lysine or polyethyleneimine, or combinations thereof. This may increase the chances of association between the template particle and, for example, a cell, which generally has a mostly negatively charged membrane.

Other strategies may be used to increase the chances of template particle-target cell association, which include creation of specific template particle geometry. For example, in some embodiments, the template particles may have a general spherical shape but the shape may contain features such as flat surfaces, craters, grooves, protrusions, and other irregularities in the spherical shape.

Any one of the above described strategies and methods, or combinations thereof, may be used in the practice of the presently disclosed template particles and method for targeted library preparation thereof. Methods for generation of template particles, and template particle-based encapsulations, were described in International Patent Publication WO 2019/139650, which is incorporated herein by reference.

Creating template particle-based encapsulations for single cell expression profiling comprises combining target cells with a plurality of template particles in a first fluid to provide a mixture in a reaction tube. The mixture may be incubated to allow association of the plurality of the template particles with target cells. A portion of the plurality of template particles may become associated with the target cells. The mixture is then combined with a second fluid which is immiscible with the first fluid. The fluid and the mixture are then sheared so that a plurality of monodisperse droplets is generated within the reaction tube. The monodisperse droplets generated comprise (i) at least a portion of the mixture, (ii) a single template particle, and (iii) a single target cell. Of note, in practicing methods of the invention provided by this disclosure, a substantial number of the monodisperse droplets generated will comprise a single template particle and a single target cell; however, in some instances, a portion of the monodisperse droplets may comprise none or more than one template particle or target cell.

In some embodiments, to increase the chances of generating an encapsulation, such as a monodisperse droplet that contains one template particle and one target cell, the template particles and target cells are combined at a ratio wherein there are more template particles than target cells. For example, the ratio of template particles to target cells 213 combined in a mixture as described above may be in a range of 5:1 to 1,000:1, respectively. In other embodiments, the template particles and target cells are combined at a ratio of 10:1, respectively. In other embodiments, the template particles and target cells are combined at a ratio of 100:1, respectively. In other embodiments, the template particles and target cells are combined at a ratio of 1000:1, respectively.
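The effect of the particle-to-cell ratio can be illustrated with a rough estimate. The sketch below rests on simplifying assumptions that are not part of this disclosure: every droplet is templated by exactly one template particle, and cells distribute among droplets independently at random (Poisson loading), ignoring any affinity between particles and cells. The function names are hypothetical.

```python
import math


def particles_needed(n_cells, ratio):
    """Number of template particles for a given particle-to-cell ratio."""
    return math.ceil(n_cells * ratio)


def loading_estimate(ratio):
    """Estimated fractions of droplets holding zero, one, or several cells.

    Assumes one template particle per droplet and Poisson-distributed cells,
    so the mean number of cells per droplet is 1 / ratio.
    """
    mean_cells = 1.0 / ratio
    empty = math.exp(-mean_cells)
    single = mean_cells * math.exp(-mean_cells)
    multiple = 1.0 - empty - single
    return {"empty": empty, "single": single, "multiple": multiple}


# Example: 10,000 cells at a 10:1 ratio of template particles to cells.
count = particles_needed(10_000, 10)
estimate = loading_estimate(10)
```

Under these assumptions, raising the ratio lowers the share of droplets with more than one cell at the cost of more droplets that contain no cell.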

To generate a monodisperse emulsion, the presently disclosed method includes a step of shearing the second mixture provided by combining a first mixture comprising template particles and target cells with a second fluid immiscible with the first mixture. Any suitable method or technique may be utilized to apply a sufficient shear force to the second mixture. For example, the second mixture may be sheared by flowing the second mixture through a pipette tip. Other methods include, but are not limited to, shaking the second mixture with a homogenizer (e.g., vortexer), or shaking the second mixture with a bead beater. In some embodiments, vortexing may be performed, for example, for 30 seconds, or in the range of 30 seconds to 5 minutes. The application of a sufficient shear force breaks the second mixture into monodisperse droplets that encapsulate one of a plurality of template particles.

In some aspects, generating the template particle-based monodisperse droplets involves shearing two liquid phases. The mixture is the aqueous phase and, in some embodiments, comprises reagents selected from, for example, buffers, salts, lytic enzymes (e.g., proteinase K) and/or other lytic reagents (e.g., Triton X-100, Tween-20, IGEPAL, bm 135, or combinations thereof), nucleic acid synthesis reagents, e.g., nucleic acid amplification reagents or reverse transcription mix, or combinations thereof. The fluid is the continuous phase and may be an immiscible oil such as fluorocarbon oil, a silicone oil, or a hydrocarbon oil, or a combination thereof. In some embodiments, the fluid may comprise reagents such as surfactants (e.g., octylphenol ethoxylate and/or octylphenoxypolyethoxyethanol) or reducing agents (e.g., DTT, beta-mercaptoethanol, or combinations thereof).

In practicing the methods as described herein, the composition and nature of the monodisperse droplets, e.g., single-emulsion and multiple-emulsion droplets, may vary. As mentioned above, in certain aspects, a surfactant may be used to stabilize the droplets. The monodisperse droplets described herein may be prepared as emulsions, e.g., as an aqueous phase fluid dispersed in an immiscible phase carrier fluid (e.g., a fluorocarbon oil, silicone oil, or a hydrocarbon oil) or vice versa. Accordingly, a droplet may involve a surfactant stabilized emulsion, e.g., a surfactant stabilized single emulsion or a surfactant stabilized double emulsion. Any convenient surfactant that allows for the desired reactions to be performed in the droplets may be used. In other aspects, monodisperse droplets are not stabilized by surfactants.

FIG. 2 illustrates a droplet 201 according to one aspect of the invention. The depicted droplet 201 is a single one of a plurality of monodisperse droplets generated by shearing a mixture according to methods of the invention. The droplet 201 comprises a template particle 207 and a single target cell 213. The template particle 207 illustrated comprises crater-like depressions 231 to facilitate capture of single cells 213. The template particle 207 further comprises an internal compartment 211 to deliver one or more reagents into the droplet 201 upon stimulus.

In some embodiments, the template particles contain multiple internal compartments. The internal compartments of the template particles may be used to encapsulate reagents that can be triggered to release a desired compound, e.g., a substrate for an enzymatic reaction, or induce a certain result, e.g., lysis of an associated target cell. Reagents encapsulated in the template particles' compartment may be, without limitation, reagents selected from buffers, salts, lytic enzymes (e.g., proteinase K), other lytic reagents (e.g., Triton X-100, Tween-20, IGEPAL, bm 135), nucleic acid synthesis reagents, or combinations thereof.

Lysis of single target cells occurs within the monodisperse droplets and may be induced by a stimulus such as heat, osmotic pressure, lytic reagents (e.g., DTT, beta-mercaptoethanol), detergents (e.g., SDS, Triton X-100, Tween-20), enzymes (e.g., proteinase K), or combinations thereof. In some embodiments, one or more of the said reagents (e.g., lytic reagents, detergents, enzymes) is compartmentalized within the template particle. In other embodiments, one or more of the said reagents is present in the mixture. In some other embodiments, one or more of the said reagents is added to the solution comprising the monodisperse droplets, as desired.

FIG. 3 illustrates a droplet 201 following lysis of a target cell. The depicted droplet 201 comprises a template particle 207 and released mRNA 301. Methods of the invention quantify amplified products of the released mRNAs 301, preferably by sequencing.

In preferred embodiments, template particles comprise a plurality of capture probes. Generally, the capture probe of the present disclosure is an oligonucleotide. In some embodiments, the capture probes are attached to the template particle's material, e.g., hydrogel material, via covalent acrylic linkages. In some embodiments, the capture probes are acrydite-modified on their 5′ end (linker region). Generally, acrydite-modified oligonucleotides can be incorporated, stoichiometrically, into hydrogels such as polyacrylamide, using standard free radical polymerization chemistry, where the double bond in the acrydite group reacts with other activated double bond containing compounds such as acrylamide. Specifically, copolymerization of the acrydite-modified capture probes with acrylamide including a crosslinker, e.g., N,N′-methylenebis, will result in a crosslinked gel material comprising covalently attached capture probes. In some other embodiments, the capture probes comprise an acrylate-terminated hydrocarbon linker, and combining the said capture probes with a template particle will cause their attachment to the template particle.

FIGS. 4-6 show an exemplary method for nonspecific amplification of mRNA according to certain aspects of the disclosure. In particular, the method relies on the presence of a poly A tail at the 3′ end of an mRNA for the non-specific capture of mRNAs.

FIG. 4 illustrates the capture of mRNA 301. Shown is a template particle 201 comprising a plurality of capture probes 401 illustrated schematically by curved broken lines. One of the capture probes 401 is featured in a larger scale and in detail. The capture probe 401 preferably comprises, from 5′ end to 3′ end, a linker region to allow a covalent bond with the template particle 201, a PR1 471 nucleotide sequence region comprising a universal primer nucleotide sequence, and at least one barcode region B1 473, which may include an index 475 nucleotide sequence and/or a UMI. The capture probe 401 further includes a capture nucleotide sequence 22 comprising a poly T nucleotide sequence. A released nucleic acid, i.e., an mRNA molecule 301 comprising a poly A sequence, attaches to the capture probe's poly T sequence 22 via complementary base pairing. Following the hybridization of the mRNA molecule 301 and the capture probe 401, a reverse transcriptase is used to perform a reverse transcription reaction to synthesize cDNA and thereby create a first strand comprising the cDNA and the capture probe sequence.

FIG. 5 illustrates synthesis of cDNA to form a first strand 23. A reverse transcriptase (not shown) synthesizes cDNA from mRNA that is hybridized to a poly T sequence of a capture probe 401. After synthesis, a first strand 23 is formed, wherein the first strand 23 comprises the cDNA and the capture probe 401 sequence. Following synthesis, the mRNA molecule 301-first strand 23 hybrid may be denatured (not shown) using any method conventional in the art, such as exposure to a denaturing temperature.
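
By way of illustration only, the capture and first-strand steps of FIGS. 4 and 5 can be modeled as simple string operations. The following is a minimal sketch, not the disclosed method itself: the function names `reverse_complement`, `build_capture_probe`, and `first_strand`, the 12-base poly T length, and all sequences are hypothetical placeholders, and the model assumes the poly T anneals at the very 3′ end of the poly A tail.

```python
# Toy string model of poly T capture and first-strand synthesis.
# All names, lengths, and sequences are hypothetical placeholders.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def build_capture_probe(primer: str, barcode: str, umi: str,
                        poly_t_len: int = 12) -> str:
    """Assemble a probe 5'->3': universal primer, barcode, UMI, poly T."""
    return primer + barcode + umi + "T" * poly_t_len


def first_strand(probe: str, mrna: str, poly_t_len: int = 12) -> str:
    """Extend the probe with cDNA complementary to the mRNA body.

    Capture requires a poly A run at the 3' end of the mRNA that is at
    least as long as the probe's poly T sequence.
    """
    if not mrna.endswith("A" * poly_t_len):
        raise ValueError("mRNA lacks a poly A tail long enough for capture")
    body = mrna[: len(mrna) - poly_t_len]  # region upstream of the annealed tail
    return probe + reverse_complement(body)
```

In this sketch the returned first strand reads, 5′ to 3′, as the capture probe sequence followed by the cDNA, which mirrors the description of the first strand 23 above.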

FIG. 6 illustrates amplification of a first strand to generate an amplicon. In particular, following the formation of a first strand 23, a second strand primer 24 comprising a random sequence, such as a random hexamer, anneals with the first strand 23 to form a DNA-primer hybrid. A DNA polymerase is used to synthesize a complementary second strand 25, i.e., an amplicon. In the embodiment illustrated, the second strand primer 24 comprises a “tail” region which does not hybridize with the first strand 23. In some embodiments, the tail region comprises a second universal primer sequence. The second strand 25 may be further amplified by PCR to generate a plurality of amplicons, and quantified by DNA sequencing.

Amplification or nucleic acid synthesis, as used herein, generally refers to methods for creating copies of nucleic acids by using thermal cycling to expose reactants to repeated cycles of heating and cooling, and to permit different temperature-dependent reactions (e.g., by polymerase chain reaction (PCR)). Any suitable PCR method known in the art may be used in connection with the presently described methods. Non-limiting examples of PCR reactions include real-time PCR, nested PCR, multiplex PCR, quantitative PCR, TS-PCR, or touchdown PCR.

The terms “nucleic acid amplification reagents” or “reverse transcription mix” encompass without limitation dNTPs (mix of the nucleotides dATP, dCTP, dGTP and dTTP), buffer/s, detergent/s, or solvent/s, as required, and a suitable enzyme such as polymerase or reverse transcriptase. The polymerase used in the presently disclosed targeted library preparation method may be a DNA polymerase, and may be selected from, but is not limited to, Taq DNA polymerase, Phusion polymerase, or Q5 polymerase. The reverse transcriptase used in the presently disclosed targeted library preparation method may be, for example, Moloney murine leukemia virus (MMLV) reverse transcriptase, or Maxima reverse transcriptase. In some embodiments, the general parameters of the reverse transcription reaction comprise an incubation of about 15 minutes at 25 degrees and a subsequent incubation of about 90 minutes at 52 degrees. Nucleic acid amplification reagents are commercially available, and may be purchased from, for example, New England Biolabs, Ipswich, Mass., USA, or Clontech.

FIGS. 7-9 illustrate a method for sequence-specific amplification of mRNA according to certain aspects of the disclosure.

FIG. 7 illustrates a method for sequence-specific capture of mRNA 301. The template particle 201 comprises a plurality of capture probes 401 illustrated schematically by curved broken lines. A featured capture probe 401 comprises, from 5′ end to 3′ end, a linker region to allow a covalent bond with the template particle 201, a PR1 region comprising a universal primer nucleotide sequence, and at least one barcode region B1, which may include an index sequence and/or a UMI, the capture probe 401 further comprising a capture sequence comprising a gene-specific sequence 26. A molecule of mRNA 301, released inside a monodisperse droplet, comprising a target sequence 481 complementary to the gene-specific sequence 26 attaches to the capture probe's gene-specific sequence 26 via complementary base pairing. The gene-specific sequence may comprise any sequence of interest, for example, a sequence corresponding to an oncogene.

For example, in some instances template particles 201 according to aspects of the invention may comprise capture probes with certain sequences specific to genes of interest, such as oncogenes. Non-limiting examples of genes of interest that may be assayed for include BAX, BCL2L1, CASP8, CDK4, ELK1, ETS1, HGF, JAK2, JUNB, JUND, KIT, KITLG, MCL1, MET, MOS, MYB, NFKBIA, EGFR, Myc, EpCAM, NRAS, PIK3CA, PML, PRKCA, RAF1, RARA, REL, ROS1, RUNX1, SRC, STAT3, CD45, cytokeratins, CEA, CD133, HER2, CD44, CD49f, CD146, MUC1/2, ABL1, AKT1, APC, ATM, BRAF, CDH1, CDKN2A, CTNNB1, EGFR, ERBB2, ERBB4, EZH2, FBXW7, FGFR2, FGFR3, FLT3, GNAS, GNAQ, GNA11, HNF1A, HRAS, IDH1, IDH2, JAK2, JAK3, KDR, KIT, KRAS, MET, MLH1, NOTCH1, NPM1, NRAS, PDGFRA, PIK3CA, PTEN, PTPN11, RB1, RET, SMAD4, STK11, TP53, VHL, and ZHX2.

FIG. 8 illustrates the synthesis of cDNA to form a first strand 23. A reverse transcriptase (not shown) synthesizes cDNA from mRNA that is hybridized to the gene-specific sequence of a capture probe 401. Following the hybridization of the target mRNA molecule 301 and the capture probe 401, a reverse transcription reaction is performed to synthesize a cDNA and create a first strand 23. The first strand 23 comprises synthesized cDNA and the capture probe 401 sequence. The target mRNA molecule-first strand hybrid is then denatured using methods conventional in the art (not shown), and a second strand primer 24 comprising a random hexamer sequence anneals with a complementary sequence of the first strand 23 to form a DNA-primer hybrid.

FIG. 9 illustrates amplification of a first strand 23 to generate an amplicon 25. In particular, following the formation of a first strand 23, a second strand primer 24 comprising a random sequence, such as a random hexamer, anneals with the first strand 23 to form a DNA-primer hybrid. A DNA polymerase is used to synthesize a complementary second strand 25, i.e., an amplicon 25. In the embodiment illustrated, the second strand primer 24 comprises a “tail” region which does not hybridize with the first strand 23. In some embodiments, the tail region comprises a second universal primer sequence.

According to aspects of the present disclosure, the term “universal primer sequence” generally refers to a primer binding site, e.g., a primer sequence that would be expected to hybridize (base-pair) to, and prime, one or more loci of complementary sequence, if present, on any nucleic acid fragment. In some embodiments, the universal primer sequences used with respect to the present methods are P5 and P7.

A barcode region may comprise any number of barcodes, indices or index sequences, or UMIs, each of which is unique, i.e., distinguishable from other barcode, index, or UMI sequences. The sequences may be of any suitable length which is sufficient to distinguish the barcode, or index, sequence from other barcode sequences. A barcode, or index, sequence may have a length of 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 nucleotides, or more. In some embodiments, the barcodes, or indices, are pre-defined and selected at random.
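
One way to judge whether a set of barcodes of a given length is sufficiently distinguishable is to compute the smallest number of mismatches between any two of them. The following is an illustrative sketch only; the use of Hamming distance as the criterion and the names `hamming` and `min_pairwise_distance` are assumptions made for this example and are not taken from the disclosure.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes) -> int:
    """Return the smallest Hamming distance over all barcode pairs."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))
```

A set whose minimum distance is 1 can be confused by a single sequencing error, whereas a larger minimum distance leaves room to detect, and in some cases correct, such errors.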

In some methods of the invention, a barcode sequence may comprise unique molecular identifiers (UMIs). UMIs are a type of barcode that may be provided to a sample to make each nucleic acid molecule, together with its barcode, unique, or nearly unique. This may be accomplished by adding one or more UMIs to one or more capture probes of the present invention. By selecting an appropriate number of UMIs, every nucleic acid molecule in the sample, together with its UMI, will be unique or nearly unique.
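
What counts as an appropriate number of UMIs can be estimated with a simple probability calculation. The sketch below assumes, purely for illustration, that each copy of a given transcript in a given cell receives a UMI drawn uniformly at random from all 4^L sequences of length L; the function name `umi_collision_probability` is hypothetical.

```python
def umi_collision_probability(n_molecules: int, umi_length: int) -> float:
    """Chance that a given molecule shares its UMI with at least one other.

    Assumes each of the n_molecules copies of one transcript in one cell
    draws its UMI uniformly at random from all 4**umi_length sequences.
    """
    if n_molecules < 1 or umi_length < 1:
        raise ValueError("n_molecules and umi_length must be positive")
    pool = 4 ** umi_length
    return 1.0 - (1.0 - 1.0 / pool) ** (n_molecules - 1)
```

Under these assumptions, 100 copies of a transcript labeled with 10-base UMIs give a per-molecule collision chance of roughly 1 in 10,000, which is consistent with the molecules being nearly unique.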

UMIs are advantageous in that they can be used to correct for errors created during amplification, such as amplification bias or incorrect base pairing during amplification. For example, when using UMIs, because every nucleic acid molecule in a sample together with its UMI or UMIs is unique or nearly unique, after amplification and sequencing, molecules with identical sequences may be considered to refer to the same starting nucleic acid molecule, thereby reducing amplification bias. Methods for error correction using UMIs are described in Karlsson et al., 2016, “Counting Molecules in cell-free DNA and single cells RNA”, Karolinska Institutet, Stockholm Sweden, incorporated herein by reference.
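
The counting principle described above can be sketched as follows. This is a minimal example under stated assumptions: reads are assumed to have already been assigned a cell barcode, a UMI, and a gene, and the function name `count_molecules` and the tuple layout are hypothetical. It collapses exact duplicates only and does not attempt to correct sequencing errors within the UMI itself.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per (cell barcode, gene).

    `reads` is an iterable of (cell_barcode, umi, gene) tuples, one per
    sequenced amplicon. Reads sharing all three values are treated as
    PCR copies of one starting molecule and are counted once.
    """
    umis = defaultdict(set)
    for cell_barcode, umi, gene in reads:
        umis[(cell_barcode, gene)].add(umi)
    return {key: len(values) for key, values in umis.items()}
```

Because each starting molecule contributes one count regardless of how many amplicons it produced, the resulting counts are less sensitive to amplification bias than raw read counts.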

In certain aspects, methods of the invention include combining template particles with target cells in a first fluid, adding a second fluid to the first fluid, and shearing the fluids to generate a plurality of monodisperse droplets simultaneously that contain a single one of the template particles and a single one of the target cells, in which the template particles preferably include one or more oligos useful in template switching oligo (TSO) embodiments. The method preferably also includes lysing each of the single target cells contained within the monodisperse droplets to release a plurality of distinct mRNA molecules; and quantifying the plurality of distinct mRNA molecules by, for example, using template switching PCR (TS-PCR), as discussed in U.S. Pat. No. 5,962,272, which is incorporated herein by reference. TS-PCR is a method of reverse transcription and polymerase chain reaction (PCR) amplification that relies on a natural PCR primer sequence at the polyadenylation site, also known as the poly(A) tail, and adds a second primer through the activity of murine leukemia virus reverse transcriptase. This method permits reading full cDNA sequences and can deliver high yield from single sources, even single cells that contain 10 to 30 picograms of mRNA.

TS-PCR generally relies on the intrinsic properties of Moloney murine leukemia virus (MMLV) reverse transcriptase and the use of a unique TSO. During first-strand synthesis, upon reaching the 5′ end of the mRNA template, the terminal transferase activity of the MMLV reverse transcriptase adds a few additional nucleotides (mostly deoxycytidine) to the 3′ end of the newly synthesized cDNA strand. These bases may function as a TSO-anchoring site. After base pairing between the TSO and the appended deoxycytidine stretch, the reverse transcriptase “switches” template strands, from cellular RNA to the TSO, and continues replication to the 5′ end of the TSO. By doing so, the resulting cDNA contains the complete 5′ end of the transcript, and universal sequences of choice are added to the reverse transcription product. This approach makes it possible to efficiently amplify the entire full-length transcript pool in a completely sequence-independent manner.

FIG. 10 illustrates the capture of mRNA 301 according to TSO embodiments. The TSO 1009 is an oligo that hybridizes to untemplated C nucleotides added by the reverse transcriptase during reverse transcription. The TSO may add, for example, a common 5′ sequence to full length cDNA that is used for downstream cDNA amplification. Shown is a template particle 201 that comprises a first capture probe 401 and a second capture probe 403. The first capture probe 401 preferably comprises, from 5′ end to 3′ end, a linker region to allow a covalent bond with the template particle 201, a P5 511 nucleotide sequence region comprising a universal primer nucleotide sequence, at least one barcode 33, and a capture nucleotide sequence 22 comprising a poly T nucleotide sequence. The second capture probe 403 preferably includes a TSO 1009, a UMI 531, a second barcode 541, and a P7 543 nucleotide sequence region comprising a universal primer nucleotide sequence. A released nucleic acid, i.e., an mRNA molecule 301 comprising a poly A sequence, attaches to the first capture probe's 401 poly T sequence 22 via complementary base pairing. Following the hybridization of the mRNA molecule 301 and the capture probe 401, TS-PCR is performed using a reverse transcriptase, i.e., murine leukemia virus reverse transcriptase, to synthesize cDNA and thereby create a first strand. During TS-PCR amplification, upon reaching the 5′ end of the mRNA template, the terminal transferase activity of the reverse transcriptase adds a few additional nucleotides (mostly deoxycytidine) to the 3′ end of the nascent first strand.

FIG. 11 shows a first strand 23 following TS-PCR amplification. The first strand 23 includes additional nucleotides that may function as a TSO-anchoring site 34. The TSO-anchoring site 34 may hybridize with the TSO 1009. After base pairing between the TSO and the TSO-anchoring site 34, the reverse transcriptase “switches” template strands, from cellular RNA to the TSO, and continues replication to the 5′ end of the TSO. By doing so, the resulting cDNA contains the complete 5′ end of the transcript, and sequences from the second capture probe 403. After synthesis of the first strand 23, the first strand 23 including capture probes 401, 403 may be released either by cleaving covalent bonds attaching the capture probes 401, 403 to a surface of the template particle 201, or by dissolving the template particle 201, for example, by heat.
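
The template-switching step can likewise be illustrated with a toy string model. The sketch below makes several simplifying assumptions that are not taken from the disclosure: exactly three untemplated C bases are added, the second capture probe is written 5′ to 3′ as a DNA-letter string ending in a matching run of G bases, and the names `reverse_complement` and `template_switch` are hypothetical.

```python
# Toy string model of template switching; names and sequences are
# hypothetical, and ribonucleotides in the TSO are written as DNA letters.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def template_switch(cdna: str, tso_probe: str, n_added: int = 3) -> str:
    """Append untemplated C bases, then copy the annealed TSO probe.

    `cdna` is the first strand (5'->3') after the reverse transcriptase
    has reached the 5' end of the mRNA. `tso_probe` is the second capture
    probe (5'->3') whose 3' end is a run of G bases that pairs with the
    added C bases.
    """
    if not tso_probe.endswith("G" * n_added):
        raise ValueError("TSO probe must end in a G run matching the C bases")
    anchored = cdna + "C" * n_added  # TSO-anchoring site
    # After the switch, synthesis continues to the 5' end of the probe.
    return anchored + reverse_complement(tso_probe[:-n_added])
```

In this model the extended first strand ends with the reverse complement of the whole second capture probe, so any UMI, barcode, and universal primer sequence carried by that probe become part of the cDNA.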

A person with ordinary skill in the art will appreciate that any one of the template particle embodiments, capture probes, primer probes, second strand primers, universal amplification primers, barcodes, UMIs, TSOs, and methods thereof described in any one of the embodiments of the presently disclosed targeted library preparation method may be used in a different combination, or embodiment, of the present method. For example, any one of the presently described second strand primers, or primer probes, may be used to prime any one of the presently disclosed first strands to allow for a DNA synthesis reaction to generate an amplicon.

In preferred embodiments, quantifying released mRNA comprises sequencing, which may be performed by methods known in the art. For example, see, generally, Quail, et al., 2012, A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers, BMC Genomics 13:341. Nucleic acid sequencing techniques include classic dideoxy sequencing reactions (Sanger method) using labeled terminators or primers and gel separation in slab or capillary, or preferably, next generation sequencing methods. For example, sequencing may be performed according to technologies described in U.S. Pub. 2011/0009278, U.S. Pub. 2007/0114362, U.S. Pub. 2006/0024681, U.S. Pub. 2006/0292611, U.S. Pat. Nos. 7,960,120, 7,835,871, 7,232,656, 7,598,035, 6,306,597, 6,210,891, 6,828,100, 6,833,246, and 6,911,345, each incorporated by reference.

The conventional pipeline for processing sequencing data includes generating FASTQ-format files that contain reads sequenced from a next generation sequencing platform, aligning these reads to an annotated reference genome, and quantifying expression of genes. These steps are routinely performed using known computer algorithms, which a person skilled in the art will recognize can be used for executing steps of the present invention. For example, see Kukurba, Cold Spring Harb Protoc, 2015 (11):951-969, incorporated by reference.
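
The first stage of such a pipeline, reading FASTQ records and separating the barcode and UMI from the remainder of the read, can be sketched as follows. This is an illustrative example only: the read layout (an 8-base barcode followed by a 6-base UMI at the start of the read) and the names `parse_fastq` and `split_read` are assumptions made for this sketch, and alignment to a reference genome is left to established tools and is not shown.

```python
import io
from typing import Iterator, Tuple


def parse_fastq(handle) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from 4-line FASTQ records."""
    while True:
        header = handle.readline()
        if not header.strip():
            return  # end of file or trailing blank line
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], sequence, quality


def split_read(sequence: str, barcode_len: int = 8,
               umi_len: int = 6) -> Tuple[str, str, str]:
    """Split a read into (barcode, UMI, remaining sequence).

    The layout is a hypothetical one chosen for illustration.
    """
    barcode = sequence[:barcode_len]
    umi = sequence[barcode_len:barcode_len + umi_len]
    return barcode, umi, sequence[barcode_len + umi_len:]


if __name__ == "__main__":
    example = "@r1\nACGTACGTTTGACACCGT\n+\n" + "I" * 18 + "\n"
    for read_id, seq, _ in parse_fastq(io.StringIO(example)):
        print(read_id, split_read(seq))
```

The remaining sequence of each read would then be aligned and assigned to a gene before expression is quantified per barcode.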

After obtaining expression profiles from single cells, the expression profiles can be analyzed by, for example, comparing the profiles with reference or control profiles to ascertain information about the single target cells. For example, see generally, Efroni, Genome Biology, 2015; and Stahlberg, Nucleic Acids Research, 2011, 39(4)e24, each of which is incorporated by reference.

In one aspect, methods and systems of the invention provide a method for identifying a rare cell from a heterogeneous cell population. The method includes isolating a plurality of single target cells from the heterogeneous cell population by combining the heterogeneous cells with a plurality of template particles in a first fluid, adding a second fluid that is immiscible with the first fluid, and shearing the fluids to generate an emulsion comprising monodisperse droplets that each contain a single target cell and a single template particle. Methods further include releasing a plurality of mRNA molecules from each of the single target cells contained within the monodisperse droplets and quantifying the plurality of mRNA molecules. Quantifying may include generating a plurality of amplicons of the mRNA molecules wherein each of the amplicons comprises a barcode or index sequence that is unique to the cell from which the mRNA molecule was derived. In some instances, methods may include sequencing the plurality of barcoded amplicons by, for example, next-generation sequencing methods to generate sequence reads for each of the amplicons. Methods may further include processing the sequence reads associated with single cells of the heterogeneous cell population to generate expression profiles for each of the single cells and using the data by, for example, performing a gene clustering analysis to identify one or more cell types or cell states.
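
As a simplified stand-in for a gene clustering analysis, the sketch below groups cells whose expression profiles point in a similar direction and then reports groups that make up only a small fraction of the population. The greedy grouping rule, the cosine similarity threshold of 0.9, the 10 percent rarity cutoff, and the names `cosine`, `greedy_cluster`, and `rare_clusters` are all assumptions chosen for illustration; established clustering methods would normally be used in practice.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def greedy_cluster(profiles, threshold: float = 0.9):
    """Group cells by similarity to the first member of each group.

    `profiles` maps a cell barcode to a list of per-gene counts, with
    genes in the same order for every cell.
    """
    clusters = []  # list of (representative profile, member barcodes)
    for cell, profile in profiles.items():
        for representative, members in clusters:
            if cosine(representative, profile) >= threshold:
                members.append(cell)
                break
        else:
            clusters.append((profile, [cell]))
    return [members for _, members in clusters]


def rare_clusters(clusters, max_fraction: float = 0.1):
    """Return clusters holding at most `max_fraction` of all cells."""
    total = sum(len(members) for members in clusters)
    return [m for m in clusters if len(m) / total <= max_fraction]
```

A group returned by `rare_clusters` is only a candidate rare cell type or state and would need to be confirmed, for example by inspecting which genes distinguish it.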

In another aspect, methods and systems of the disclosure provide a method for analyzing a heterogeneous tumor biopsy taken from a subject. The method includes obtaining a biopsy from a patient and isolating a population of cells from the biopsy. The method further includes segregating the population of cells taken from the biopsy into droplets by combining the population of cells with a plurality of template particles in a first fluid, adding a second fluid that is immiscible with the first fluid, and shearing the fluids to generate an emulsion comprising monodisperse droplets that each contain a single one of the population of cells and a single template particle. Methods further include releasing a plurality of mRNA molecules from each one of the segregated single cells contained within the monodisperse droplets and performing transcriptome analysis on one or more genes of the single cells and using the transcriptome data to identify one or more characteristics of the tumor. A characteristic identified can be the presence, or absence, of one or more gene transcripts associated with a cancer. A method disclosed herein can further comprise the step of using the characteristic to diagnose a subject with cancer or a cancer stage or to devise a treatment plan.

In some aspects, methods and systems of the invention provide a method for determining the potential effectiveness of a therapeutic agent. The method comprises segregating a first population of diseased cells into monodisperse droplets with template particles and determining the expression level of at least one nucleic acid from at least one of the diseased cells, thereby producing a disease-state expression signature. The method further includes exposing a second population of disease state cells to an agent and determining the expression level of at least one nucleic acid from at least one of the individual cells from the second population and comparing the expression level from the individual cell from the second population to the disease-state expression signature to thereby determine the effectiveness of the agent against the disease. In some embodiments, the therapeutic agent may be delivered to the second population of cells inside monodisperse droplets. For example, the agent may be associated with the template particle by tethering the agent to an external surface of the template particle, or packaging the agent inside a compartment of the template particle such that the agent can be delivered to the cells contained inside the monodisperse droplets.
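
The comparison between a treated cell and the disease-state expression signature can be expressed, for example, as a per-gene log2 fold change. The sketch below is illustrative only: the pseudocount of 1, the function name `log2_fold_change`, and the dictionary layout are assumptions made for this example rather than parameters taken from the disclosure.

```python
import math


def log2_fold_change(treated, signature, pseudocount: float = 1.0):
    """Per-gene log2 ratio of treated counts to signature counts.

    `treated` and `signature` map gene names to expression counts. A
    pseudocount keeps the ratio defined when a count is zero. Genes
    absent from `treated` are taken as zero.
    """
    return {
        gene: math.log2((treated.get(gene, 0) + pseudocount)
                        / (baseline + pseudocount))
        for gene, baseline in signature.items()
    }
```

A strongly negative value for a gene that is elevated in the disease state would be one indication, among others, that the agent shifts expression away from the disease-state signature.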

In any one of the embodiments of the presently disclosed targeted library preparation method, the template particle further comprises a capture moiety. In some embodiments, the capture moiety acts to capture specific target particles, for example, specific types of cells. In some embodiments, the capture moiety comprises an Acrylate-terminated hydrocarbon linker with biotin termination. In some embodiments, the capture moiety is attached to a target-specific capture element. In some embodiments, the target-specific capture element is selected from aptamers and antibodies. Embodiments of the capture moiety and methods thereof are disclosed in world application WO2020069298A1, incorporated herein by reference.

1-21. (canceled)

22. A method comprising: combining template particles with cells in a first fluid wherein the template particles are linked to copies of a first capture probe comprising a poly-T sequence and a second capture probe comprising a template-switching segment; generating droplets of the first fluid, in a second immiscible fluid, wherein at least one droplet contains a single template particle and a single target cell; breaking the droplets; lysing the target cell to release a plurality of distinct mRNA molecules in the droplet; capturing at least one mRNA molecule in the droplet with the poly-T sequence of the first capture probes of the single template particle; reverse transcribing the mRNA molecule to yield a cDNA using a reverse transcriptase that adds additional bases to a 3′ end of the cDNA; annealing the additional bases of the cDNA to the template-switching segment of the second capture probe; and copying, by the reverse transcriptase, the annealed second capture probe into the cDNA.

23. The method of claim 22, wherein the copying step leaves the cDNA linked to bead, the cDNA comprising first and second universal primer binding sites, at least one cell barcode, and at least one unique molecular identifier (UMI).

24. The method of claim 22, wherein the template-switching segment comprises a plurality of ribo-Gs.

25. The method of claim 22, wherein the additional bases added by the reverse transcriptase comprise CCC.

26. The method of claim 22, wherein the second capture probe comprises, from 5′ to 3′ end: a covalent bond with the template particle; a first universal primer nucleotide sequence, a first barcode, and a capture sequence that includes the poly-T nucleotide sequence.

27. The method of claim 22, wherein the second capture probe comprises one or more of the template switching segment, a UMI, a second barcode, and a second universal primer nucleotide sequence.

28. The method of claim 22, wherein the generating step comprises adding the second fluid to the first fluid in a tube, and vortexing the tube to shear the first and second fluids, causing the droplets to form simultaneously and monodisperse.

29. The method of claim 22, further comprising quantifying the plurality of distinct mRNA molecules and optionally generating an expression profile for the cells after quantifying the plurality of distinct mRNA molecules.

30. The method of claim 29, wherein the first fluid is an aqueous fluid and the second fluid comprises an oil.

31. The method of claim 30, wherein the template particles further comprise one or more compartments.

32. The method of claim 31, wherein the one or more compartments contain a reagent selected from a group comprising a lytic reagent, a nucleic acid synthesis reagent, or combination thereof.

33. The method of claim 32, wherein the nucleic acid synthesis reagent comprises a murine leukemia virus reverse transcriptase.

34. The method of claim 22, further comprising amplifying the cDNA by PCR to generate amplicons.

35. The method of claim 34, wherein amplicons are sequenced and counted using UMIs to deduplicate the amplicons.

36. The method of claim 34, wherein said amplifying step comprises adding primer binding sites comprising capture probes at 3′ and 5′ ends of each cDNA.

37. The method of claim 34, wherein the amplifying step comprises using a universal primer site on the second capture probe and sets of gene-specific primers.

38. The method of claim 22, wherein the breaking step occurs after the capturing step.

39. The method of claim 22, wherein the breaking step occurs after the reverse transcribing step.